1 Introduction

The question of whether private schools provide better education than public schools is at
the center of the current national debate over the role of vouchers, charter schools, and other
reforms that increase choice in education. Since Catholic schools account for about two
thirds of private school enrollment in the U.S., assessing the effectiveness of Catholic schools
is an important part of the assessment of private schooling. This is especially true in light of
a recent U.S. Supreme Court decision that permits students to use publicly financed vouchers
to pay tuition at religious schools. Simple cross tabulations or multivariate regressions of
outcomes such as high school graduation and college enrollment typically show a substantial
positive effect of Catholic school attendance. However, the positive effects of Catholic school
attendance may be due to nonrandom selection into Catholic schools that induces spurious
correlations between Catholic school attendance and unmeasured family characteristics that
are favorable to education.

All serious studies of public/private school differences acknowledge this sample selection
problem and most wrestle with it in one way or another.^1 In the absence of experimental
data, the main option is to find a nonexperimental source of variation Zi in Catholic school
attendance that is exogenous with respect to the outcome under study. The problem,
however, is that most student background characteristics that influence schooling decisions,
such as income, attitudes, and education of the parents, are likely to influence outcomes
independently of the school since they are likely to be related to other parental inputs.
These variables must be included in the vector of controls Xi to avoid omitted variables
bias. Characteristics of private and public schools such as tuition levels, student body
characteristics, or school policies are likely to be related to the effectiveness of the schools
and so are poor candidates for excluded instruments.

Two influential papers provide potential instrumental variables. Evans and Schwab
(1995) treat Catholic schooling as exogenous in much of their analysis, but also present
estimates that rely in part on the assumption that religious affiliation affects whether a
person attends a Catholic school but has no independent effect on the outcome under study.

^1 A few examples of early studies of Catholic schools and other private schools are Coleman et al. (1982),
Noell (1982), Goldberger and Cain (1982), Alexander and Pallas (1985), and Coleman and Hoffer (1987).
Recent studies include Evans and Schwab (1993, 1995), Tyler (1994), Neal (1997), Figlio and Stone (1998),
Grogger and Neal (2000), Sander (2001), and Jepsen (forthcoming). Murnane (1984), Witte (1992), Chubb
and Moe (1990), Cookson (1993), and Sander (2001) provide overviews of the discussion and references
to the literature.
Specifically, they use a dummy variable for affiliation with the Catholic church (Ci) as their
excluded variable. Some support for this assumption is evidenced by the fact that being
Catholic is strongly correlated with Catholic school attendance, while Catholics are not far
from national averages on many socio-economic indicators. Evans and Schwab find a strong
positive effect of Catholic school attendance on high school graduation and on the
probability of starting college. However, as Murnane (1985), Tyler (1994), and Neal (1997) note,
being Catholic could well be correlated with characteristics of the neighborhood and family
that influence the effectiveness of schools.^2

Neal (1997) uses proxies for geographic proximity to Catholic schools as an exogenous
source of variation in Catholic high school attendance (see also Tyler, 1994). The basic
assumption is that the location of Catholics or Catholic schools was determined by historical
circumstances unrelated to unobservables that influence performance in schools. Using data
from the National Longitudinal Survey of Youth (NLSY), Neal estimates bivariate probit
models of Catholic high school attendance and high school graduation, in which Catholic
school effects are identified by excluding whether the person is Catholic, the fraction of
Catholics in the county population, and the number of Catholic schools in the county.^3

The interaction between whether a person is Catholic and the availability of Catholic
schools is a natural alternative to using distance or religion separately. It is quite possible
that proximity to Catholic schools is related to differences in regional and family
characteristics that have a direct influence on schooling and labor market outcomes, given that Catholic
schools are somewhat concentrated by region.^4 However, since “tastes” for Catholic schooling

^2 Neal (1997) points out that one problem with using Ci as an instrumental variable when estimating
Catholic school effects (as in Neal (1997) and Evans and Schwab (1995)) is that religious identification might 
be influenced by the school type attended. Neither study investigates the issue. In the case of NELS:88 we 
use the parent’s report of religious affiliation while the student is in eighth grade as our religion measure. 
Cross tabulations of differences between the parent’s report and the child’s tenth grade report with whether 
the child attends a Catholic high school suggest that attending a Catholic high school influences the child’s 
report. However, our NELS:88 results are not very sensitive to using the child’s report in place of the parent’s 
eighth grade report. Consequently, our evidence on the importance of this issue is mixed. 

^3 His results are not sensitive to adding Catholic to the outcome equation. However, in the appendix
we show that in our data nonlinearities in the effects of religion and family background rather than the 
location variables are the main source of identification when we use Neal’s measures of proximity to Catholic 
schools. Tyler (1994) uses the fraction of students in the school district who attended Catholic schools as 
an instrument. However, Tyler does not allow this variable or other detailed geographic variables to have a 
direct effect on the outcome. Tyler notes that his aggregated measure of school choice is likely to be affected 
by district level variation in family or school characteristics that affect outcomes as well as by distance to 
Catholic schools. For both reasons, his results should be discounted. 

^4 Hoxby (1995) discusses geographical concentration by region, much of which is associated with the
geographic concentration of the Catholic population in the past.
depend strongly on religious preference, the interaction between distance (Di) and religious
affiliation will have an effect on Catholic school attendance that is independent of the
separate effects of religious affiliation and distance. In particular, Catholic school attendance
is likely to be much more sensitive to distance for Catholics than for non-Catholics.
Consequently, one can control for both religious affiliation and for distance from Catholic schools,
as well as for a set of other geographic characteristics (such as city size, region, labor market
characteristics, average family income, and public school characteristics), while excluding the
interaction Ci x Di from outcome models. However, the case that Ci x Di may be a valid
instrument even if Ci and Di are not is far from bulletproof. Catholic parents who want
their children to attend Catholic schools might choose to live near Catholic schools. This
could lead to a positive or negative bias depending on the relationship between preferences
for Catholic school and the error component in the outcome equation. Also, past
immigration patterns and internal migration from city to suburb and across regions may have led
to differences between Catholics and non-Catholics in the correlation between proximity to
Catholic schools and observed and unobserved components of family background.

In this paper we explore the validity of Ci, Di, and Ci x Di as exogenous sources of
variation for identifying the effects of Catholic schooling on educational attainment and
achievement. Religion and proximity have figured prominently in the literature regarding
Catholic schools, but there is a need for a systematic effort to evaluate these measures as
valid instrumental variables. We use multiple data sets and methods to perform such an
evaluation. Our main data set is the National Educational Longitudinal Survey of 1988
(NELS:88), but we also report results based on the National Longitudinal Study of the High
School Class of 1972 (NLS-72). For each instrument, we present 2SLS and bivariate probit
estimates that rely on the particular instrument as the source of identification and compare
the results to OLS and univariate probit estimates.
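
The comparison between single-equation and instrumental variable estimates can be illustrated with a small simulation. This is only a sketch on made-up data, not the paper's specification: the variable names (`z` for a valid binary instrument, `u` for an unobserved family factor, `d` for attendance) and all parameter values are assumptions chosen for illustration. With one instrument and no other controls, 2SLS reduces to the ratio of two covariances.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def simulate(n=40000, true_effect=0.10, seed=0):
    """Hypothetical data: attendance depends on the instrument and on an
    unobserved factor u that also raises the outcome directly."""
    rng = random.Random(seed)
    z, d, y = [], [], []
    for _ in range(n):
        zi = 1 if rng.random() < 0.3 else 0      # instrument, independent of u
        ui = rng.gauss(0.0, 1.0)                 # unobserved family factor
        latent = 0.4 * zi + 0.2 * ui + rng.gauss(0.0, 0.3)
        di = 1 if latent > 0.5 else 0            # attendance indicator
        yi = true_effect * di + 0.3 * ui + rng.gauss(0.0, 0.2)
        z.append(zi)
        d.append(di)
        y.append(yi)
    return z, d, y


z, d, y = simulate()
ols = cov(d, y) / cov(d, d)   # single-equation slope, contaminated by u
iv = cov(z, y) / cov(z, d)    # IV (2SLS with one instrument, no controls)
print(f"OLS: {ols:.3f}  IV: {iv:.3f}  (true effect 0.10)")
```

In this sketch OLS overstates the effect because `u` drives both attendance and the outcome, while the IV ratio recovers it only because `z` is unrelated to `u` by construction. The paper's question is whether religion and distance satisfy that condition.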

In addition to examining the a priori case for the instruments, the face plausibility and 
precision of the IV estimates, and the consistency across data sets, we assess the quality of 
the instruments in two other ways. The first approach takes advantage of the fact that few 
students who attend public 8th grades attend Catholic high school. This provides some 
justification for using the coefficient on the instrument in a reduced-form outcome equation 
from a sample of public eighth grade attendees in NELS:88 as an estimate of the direct link 
between Catholic religion and the outcome. 
The second approach uses a methodology introduced in Altonji, Elder and Taber (2001)
(hereafter, AET) to assess the instrumental variable results. AET’s approach is based on
the idea of using the degree of selection on observables as a guide to how much selection
there is on unobservables.^5 In an ideal world, the instrument would be randomly assigned
either by nature or through a controlled experiment. In this case, the instruments would
be uncorrelated with both the observed and unobserved determinants of the outcome. Short
of that, the hope in using an IV strategy is that the observed variables that are used as
controls in the outcome equation are systematically chosen so that the instrumental variable
has no relationship with the unobserved variables that determine the outcome, conditional
on the observables. However, as AET argue, major data sets with large samples and
extensive questionnaires are not designed to address one relatively specific question, such
as the effectiveness of Catholic schools using a particular IV approach, but rather to serve
multiple purposes. Because there are a limited number of factors that we know how to
collect, can afford to collect, and expect to matter for a particular outcome, many relevant
variables are left out. In such a world, it is prudent to consider an alternative benchmark
case in which the observed variables are a random subset of the factors that influence the
outcome rather than the perfect control set given the instrument. This is particularly true
in the absence of strong prior information about the sources of variation in the instrument.
AET show that under certain conditions, the regression coefficients relating the instrumental
variable to the regression index of the observables in the outcome equation and to the error
term in the outcome equation will be the same. We use their approach to estimate what
the bias in the IV estimates would be if the assumption of equal selection on observables and
unobservables were correct. We restrict ourselves to NELS:88 because the calculation only
makes sense when a rich set of observables is available.
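
The equal-selection benchmark can be stated compactly. The display below is a sketch in notation assumed here for illustration ($Z_i$ for the instrument, $X_i'\gamma$ for the index of observables in the outcome equation, $\varepsilon_i$ for the outcome error); AET give the formal conditions under which it holds.

$$
\operatorname{Proj}(Z_i \mid X_i'\gamma, \varepsilon_i) = \phi_0 + \phi_{X'\gamma}\, X_i'\gamma + \phi_{\varepsilon}\, \varepsilon_i,
\qquad \phi_{\varepsilon} = \phi_{X'\gamma}.
$$

If, in addition, the index and the error are uncorrelated, this is equivalent to $\operatorname{Cov}(Z_i, \varepsilon_i)/\operatorname{Var}(\varepsilon_i) = \operatorname{Cov}(Z_i, X_i'\gamma)/\operatorname{Var}(X_i'\gamma)$, so the ratio on the right, which can be estimated, stands in for the ratio on the left, which cannot.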

We began our study with the strong prior that the reliance on the interaction between
distance from Catholic schools and Catholic religion to identify the Catholic school effect
could overcome potential objections to the exclusion of location variables and religion from
the outcome equations, thereby providing convincing estimates of the Catholic school effect.
Unfortunately, we end it with the negative conclusion that distance, religion, and distance

^5 Researchers often informally argue for the exogeneity of membership in a “treatment group” or of an
instrumental variable by examining the relationship between group membership or the instrumental variable
and a set of observed characteristics, or by assessing whether point estimates are sensitive to the inclusion
of additional control variables. See, for example, Currie and Duncan (1995), Engen et al. (1996), Poterba et
al. (1994), Angrist and Evans (1998), Jacobsen et al. (1999), Bronars and Grogger (1994), and Udry (1998).
interacted with religion are all problematic instrumental variables, at least in the existing
national data sets. 

In Section 2 we discuss the data from NLS-72 and NELS:88 that are used in the study.
In Section 3 we present results using religion as the source of identification and provide some
initial evidence on the direct effect of being Catholic on educational attainment. We also
introduce and apply AET’s method of using the observables to assess the potential for bias
from an association between the instrument and the unobservables. In Section 4 and in
Section 5 we present results using distance and the interaction between distance and religion
as the excluded instruments. Section 6 concludes.

2 Data 
2.1 NELS:88 

NELS:88 is a National Center for Education Statistics (NCES) survey which began in the
Spring of 1988. The base year sample is a two stage stratified probability sample in which
a set of schools containing eighth grades was chosen on the basis of school size and
private/public status. In the second stage, as many as 26 eighth grade students from within a
particular school were chosen based on race and gender. A total of 1032 schools contributed
student data in the base year survey, resulting in 24,599 eighth graders participating.
Subsamples of these individuals were reinterviewed in 1990, 1992, and 1994. The NCES only
attempted to contact 20,062 base-year respondents in the first and second follow-ups, and
only 14,041 in the 1994 survey. Additional observations are lost due to attrition.

Parent, student, and teacher surveys in the base year provide a rich set of information
on family and individual background, as well as pre-high school achievement, behavior, and
expectations of success in high school and beyond. Each student was also administered
a series of cognitive tests in the 1988, 1990, and 1992 surveys to ascertain aptitude and
achievement in math, science, reading, and history. We use standardized item response
theory (IRT) test scores that account for the fact that the difficulty of the 10th and 12th
grade tests taken by a student depends on the 8th grade scores. We use the 8th grade test
scores as control variables and the 10th and 12th grade reading and math tests as outcome
measures.

For each respondent, a measure of distance from the nearest Catholic high school was
obtained by computing the distance from the zip code centroid of the respondent’s eighth
grade school to the zip code centroid of the closest Catholic high school.^6 From this
information we constructed our distance measure Di, which is a vector of mutually exclusive
indicators for distance less than 1 mile, 1 to 3 miles, 3 to 6 miles, 6 to 12 miles, and 12 to
20 miles, with greater than 20 miles treated as the omitted category. Our religion indicator
Ci is 1 if parents indicated that they are Catholic in response to a question about religious
affiliation in the base year survey and is 0 otherwise.
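
The construction of the distance indicators can be sketched as follows. The bin edges are the ones listed above; the function and key names are hypothetical, and the treatment of boundary values (each bin includes its lower edge and excludes its upper edge, so exactly 20 miles falls in the omitted category) is an assumption, since the text does not say how ties are handled.

```python
# Bin labels and edges in miles; distances of 20 miles or more are the omitted category.
BINS = [
    ("lt_1", 0.0, 1.0),
    ("1_to_3", 1.0, 3.0),
    ("3_to_6", 3.0, 6.0),
    ("6_to_12", 6.0, 12.0),
    ("12_to_20", 12.0, 20.0),
]


def distance_indicators(miles):
    """Return mutually exclusive 0/1 indicators for a distance in miles.

    All indicators are 0 when the distance is in the omitted (20+ mile) category.
    """
    if miles < 0:
        raise ValueError("distance must be non-negative")
    return {name: int(lo <= miles < hi) for name, lo, hi in BINS}


near = distance_indicators(0.5)    # falls in the "less than 1 mile" bin
mid = distance_indicators(2.0)     # falls in the "1 to 3 miles" bin
far = distance_indicators(25.0)    # omitted category: every indicator is 0
```

Because the bins do not overlap, at most one indicator equals 1 for any respondent, which is what allows the 20+ mile group to serve as the reference category in the regressions.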

Our main outcome measures are high school graduation (HSi) and college attendance
(COLLi). HSi is one if the respondent graduated high school by the date of the 1994
survey, and zero otherwise.^7 COLLi is one if the respondent was enrolled in a four-year
university at the date of the 1994 survey and zero otherwise.^8 The indicator variable for
Catholic high school attendance, CHi, equals one if the current or last school in which the
respondent was enrolled was Catholic as of 1990 (two years after the eighth grade year) and
zero otherwise.^9 Unless noted otherwise, the results reported in the paper are weighted.^10

2.2 NLS-72 

The NLS-72 is a Department of Education survey of high school students that contains
information on 22,652 persons who were seniors during the 1971-1972 academic year. Additional
interviews were conducted in 1973, 1974, 1976, 1979, and 1986. The final sample sizes are
19,489 students from 1192 public high schools and 71 Catholic high schools for the college
attendance indicator variable, 14,671 students from 879 public high schools and 57 Catholic

^6 Detailed information on zip code characteristics of the eighth grade school (at the zip code level) is
available on the NELS:88 Restricted Use files. For the NELS:88 analysis, the zip code of every Catholic
high school in the United States in 1988 was obtained from Ganley’s Catholic High Schools in America:
1988. The distance from a particular zip code centroid to the centroids of all the Catholic high schools was
calculated using an algorithm obtained from the U.S. National Oceanic and Atmospheric Administration.
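
A centroid-to-centroid distance of this kind can be sketched with the haversine great-circle formula. This is an assumption made for illustration, not the NOAA algorithm referred to in the footnote, whose details are not given here. The function names, the Earth radius constant, and the example coordinates are likewise hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the argument slightly above 1
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def nearest_school_miles(origin, school_centroids):
    """Distance from an origin centroid to the closest school centroid."""
    lat0, lon0 = origin
    return min(great_circle_miles(lat0, lon0, lat, lon) for lat, lon in school_centroids)


# Hypothetical centroids: one degree of latitude is roughly 69 miles.
origin = (40.0, -90.0)
schools = [(41.0, -90.0), (43.0, -90.0)]
closest = nearest_school_miles(origin, schools)
```

Measuring between zip code centroids rather than street addresses means that a respondent and a school in the same zip code are recorded at distance zero, which is one reason the distance measure is coarsened into bins.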

^7 We obtain similar results using a “drop out” dummy variable which equals one if a student dropped out
of high school by 1992, or if the student dropped out of high school by 1990 and was not reinterviewed in
1992 or 1994, zero otherwise. This variable catches dropouts who left the survey by 1990 and were either
dropped from the sample or were nonrespondents.

^8 Our major findings are robust to whether college attendance is limited to 4-year universities, to
full-time versus part-time enrollment, and to whether enrollment is measured “at some time since high
school” or at the survey date.

^9 A student who started in a Catholic high school and transferred to a public school prior to the tenth
grade survey would be coded as attending a public high school (CH = 0). If such transfers are frequently
motivated by discipline problems, poor performance, or alienation from school, then misclassification of
the transfers as public high school students could lead to upward bias in estimates of the effect of CH on
educational attainment. AET present evidence that this issue is of minor importance.

^10 The sampling scheme in the NELS:88 is complicated and explained in more detail in AET (2002).
The results are somewhat sensitive to the use of sample weights, although our main findings are robust to
weighting. Given the sampling scheme the weighted estimates are clearly preferred.
high schools for the math and reading score variables, and 16,276 students from 1191 public
high schools and 71 Catholic high schools for the years of academic education variable. 

The variable Ci is 1 for students who indicated they were Catholic in response to a
base year question about religious affiliation and is 0 otherwise. Distance from the nearest
Catholic high school was recorded as the distance in air miles between the centroids of the
zip code of residence reported in the first follow-up, and the zip code of the nearest Catholic
high school. The follow-up survey included an indicator for whether the respondent had
moved between their senior year of high school and the survey date, so the 10,530 students
who moved were assigned the mean value of distance for all non-movers who attended the
same high school.
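
The imputation rule for movers can be sketched as follows. The record layout (dictionaries with `school`, `moved`, and `distance` keys) and the function name are hypothetical. Returning `None` for a mover whose school has no non-movers with a valid distance is an assumption about how such cases would be flagged before being dropped.

```python
from collections import defaultdict


def impute_mover_distance(records):
    """Assign each mover the mean distance of non-movers from the same high school.

    records: list of dicts with hypothetical keys 'school', 'moved', 'distance'.
    Returns new dicts; the input records are left unchanged.
    """
    totals = defaultdict(lambda: [0.0, 0])  # school -> [sum of distances, count]
    for r in records:
        if not r["moved"] and r["distance"] is not None:
            totals[r["school"]][0] += r["distance"]
            totals[r["school"]][1] += 1

    imputed = []
    for r in records:
        row = dict(r)
        if r["moved"]:
            total, count = totals.get(r["school"], (0.0, 0))
            # No non-mover with a valid distance at this school: leave missing
            row["distance"] = total / count if count else None
        imputed.append(row)
    return imputed


sample = [
    {"school": "A", "moved": False, "distance": 2.0},
    {"school": "A", "moved": False, "distance": 4.0},
    {"school": "A", "moved": True, "distance": 99.0},
    {"school": "B", "moved": True, "distance": 5.0},
]
result = impute_mover_distance(sample)
```

The design choice here is that a mover's own post-move zip code is discarded entirely, since it says little about where the student lived while in high school.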

In the original design, schools with a high percentage of minority students and in low
income areas are overrepresented, and sampling weights also vary with whether the school
is public or private. The results are not sensitive to weighting procedures, so the estimates
reported below are based on unweighted data.



3 Using Religious Affiliation to Identify the Catholic 
School Effect 

In Table 1, we present univariate probit, OLS, bivariate probit, and 2SLS estimates of the
Catholic school effect for our three separate instrumental variables. The table footnotes
provide a list of the family background, city size, region, student characteristics, and
eighth-grade behavioral and academic outcomes that are included in both the equations for CHi
and the outcomes (Yi). In this section our focus is on the first column in which we use Ci
as the excluded instrument and include Di but not Ci x Di in the equations for both CHi
and Yi. In sections 4 and 5 we will discuss the results from the second and third columns,
respectively.

^11 The 2236 students who did not report their religious affiliation are excluded from the analysis. We
also drop an additional 495 students for whom we could not impute distance from the nearest Catholic
high school, reducing the sample size to 19,921. We also exclude 111 cases in which the student attended a
non-Catholic private school, and additional observations are lost because data for key control variables and
outcomes are missing.

^12 The zip code of every Catholic high school in existence in the United States is listed in the US Department
of Education’s “Universe of Private Schools”. 

^13 The 495 students who were dropped because no distance measures could be created for them either
attended one of the 26 high schools for which there are no valid observations on distance, or did not have
valid values for the geographic move variable. These schools were part of NLS-72’s “backup sample”, and
the students in this subsample were lost because they were excluded from the first follow-up.
In NELS:88 the 2SLS estimate for high school graduation is 0.34 (0.08). This point
estimate is extremely large, given that the sample mean of HSi is 0.84. The bivariate 
probit estimate of the average marginal effect is a more reasonable value of 0.128, but it is 
still double the univariate probit estimate. The estimates of the effect on enrollment in a 
four-year college in 1994 are also unreasonably large, as the 2SLS coefficient of 0.40 (0.10) 
is larger than the sample mean of 0.29. The bivariate probit estimate of 0.170 is also well 
above the univariate probit estimate of 0.094. 

We obtain a different pattern in NLS-72 (bottom panel). For this data set the analysis
conditions on enrollment in 12th grade, so one should not expect these results to exactly
mirror those in NELS:88. The probit estimate of the effect of CHi on college attendance
is 0.068, which is roughly equal to the two stage least squares estimate of 0.06 (0.04). This
apparent similarity should be interpreted carefully, as the 2SLS standard error is substantial.
In fact, the point estimate is not significantly different from zero even though it implies a
large Catholic schooling effect.
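
The statements about precision can be checked with simple arithmetic on the 2SLS coefficients and standard errors reported above. The sketch below assumes the usual large-sample normal approximation with a two-sided 5 percent critical value of 1.96; the dictionary labels and function names are hypothetical.

```python
# (coefficient, standard error) pairs as reported in the text for Table 1
estimates = {
    "NELS:88 high school graduation": (0.34, 0.08),
    "NELS:88 four-year college": (0.40, 0.10),
    "NLS-72 college attendance": (0.06, 0.04),
}


def t_ratio(coef, se):
    """Ratio of a coefficient to its standard error."""
    return coef / se


def significant_at_5pct(coef, se, critical=1.96):
    """Two-sided test against zero under a normal approximation (an assumption)."""
    return abs(t_ratio(coef, se)) > critical


for label, (coef, se) in estimates.items():
    print(f"{label}: t = {t_ratio(coef, se):.2f}, "
          f"significant at 5%: {significant_at_5pct(coef, se)}")
```

The NLS-72 ratio of 1.5 falls short of the critical value, which is the sense in which that point estimate is not significantly different from zero, while both NELS:88 ratios exceed 4.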

The bivariate probit estimate is only -0.002, but it should be kept in mind that the 
source of identification in the bivariate probit case is a complicated nonlinear function of 
the variables in the model for CHi and not simply Ci, even though only Ci is excluded from 
the outcome equation. In particular, we suspect that the interaction between Ci and Di 
plays an important role and leads to a reduction in the point estimate relative to 2SLS for 
reasons that will become clear when we discuss the results based on Ci x Di. We analyze 
the bivariate probit in the appendix and conclude that identification comes primarily from 
the functional form assumption rather than the exclusion restrictions. Thus we focus on the 
2SLS results when thinking about the validity of particular instruments. 

Table 2 reports OLS and two stage least squares estimates of the effect of Catholic high
school on test scores in NELS:88 and a variety of outcomes in NLS-72. Column (1) shows
that the 2SLS estimates are larger for both NELS test scores than the single-equation ones,
although the 2SLS coefficients are noisy. The standard deviation of these tests is 10, so
the 2SLS estimate of 2.64 implies a large impact on 12th grade math scores. However,
the fact that the OLS estimates are uniformly smaller indicates that either 2SLS is biased
upward or that Catholic high school students are actually negatively selected on the basis

^14 The NELS:88 results change very little when we condition the analysis on making it to 12th grade or
on HS = 1, so we cannot attribute the similarity of the results from 2SLS and single-equation methods in
NLS-72 but not NELS:88 to the fact that NLS-72 is limited to those who have made it to 12th grade.
of unmeasured factors which are correlated with test scores. The NLS-72 test score results
follow the opposite pattern: 2SLS estimates are negative while OLS is large and positive for
both reading and math. It should be kept in mind that the NLS-72 results do not control
for eighth grade achievement.

To summarize, in NELS:88 the 2SLS estimates using Ci as the exclusion restriction imply 
that the Catholic school effect is very large, particularly for educational attainment. The 
results based on NLS-72 are more mixed but are consistent with a substantial positive effect 
on educational attainment. One might be tempted to conclude that IV estimates, while 
unreasonably large, bolster the probit and OLS evidence that the true effect is substantial. 
In the remainder of this section, we explore whether this is the right interpretation. 

3.1 Comparing the Characteristics of Catholics and non-Catholics 

Column (1) of Table 3a presents sample means of a set of family background characteristics,
student characteristics, eighth grade outcomes, and high school outcomes in NELS:88, and
Column (2) shows the difference between Catholics and non-Catholics in these means.^15 The
table shows that Catholics are 7 percentage points more likely to graduate high school and
8 percentage points more likely to be enrolled in a four year college in 1994. Differences in
tenth and twelfth grade test scores are more modest but all show a significant advantage for
Catholic students. If Catholic were as good as randomly assigned, these differences would be
entirely attributed to the fact that Catholics are more likely to attend Catholic high school.
It would then be troubling if Catholic appeared to be related to variables determined prior to
high school enrollment. Consequently, we begin our evaluation of Catholic religion as an
excluded instrument by following the common practice of simply comparing the characteristics
of Catholics and non-Catholics in both NELS:88 and NLS-72.

Unfortunately, differences by Ci appear in many of the family and student characteristics
and eighth grade outcomes in Table 3a. There is a modest positive association between
Catholic religion and parental educational expectations, with a gap of 0.04 in the fraction of
parents who expect their children to attend some college and 0.03 in the fraction who expect
at least a college degree.^16 While the differential in family income is positive, it is negative in

^15 In Table 3a the outcome variables are weighted with the same weights used in the regression analysis,
so that the 10th and 12th grade test scores are weighted using first and second follow-up panel weights, 
respectively, and high school graduation and college attendance are weighted by third follow-up weights. All 
other variables are weighted using second follow-up panel weights. 

^^Some of the variables used in our multivariate models are excluded from Table 3a to keep them man- 
mother’s and father’s education. However, Table 3a also shows that Catholic students have
favorable characteristics across a broad set of measures available in eighth grade, such as test
scores, grades, and teacher evaluations of the student’s behavior. Among these eighth grade
variables, only the “unpreparedness index” variable does not vary favorably with Ci. The
discrepancy in the fraction of students who repeated a grade in grades 4-8 is -0.03, and the
gap in the fraction of students who are frequently disruptive is -0.02. The existence of gaps in
favor of Catholic students across several dimensions suggests that Catholic and non-Catholic
students differ in many respects, some of which may be unobservable to empirical researchers.
Since these differences also contribute to high school and post-high school outcomes (see AET
for evidence), doubts arise regarding the validity of using Ci as an instrumental variable for
Catholic high school attendance.

In NLS-72, the differences are less pronounced, although it appears that overall Catholic
religion has a weak positive association with favorable family background characteristics.
Log family income is 0.07 higher for Catholics, who are also five percentage points less likely
to be members of families which meet NLS-72’s definition of low socio-economic status. There
are also essentially no differences in parental education levels or pre-high school student
educational expectations, with an insignificant negative gap of -0.01 (0.007) in an indicator
for whether the student decided to attend college before high school.

Given the overall picture of Tables 3a and 3b, we anticipate that the use of Ci as an 
instrumental variable will likely result in positively biased estimates of Catholic schooling 
effects in NELS:88, and perhaps a small positive bias in NLS-72, although it is difficult to 
gauge the extent of the bias. The richness of the NELS:88 data permits us to use two more 
formal procedures to gauge its magnitude and direction. 

3.2 The Effect of Catholic Religion for Students from Public Eighth 
Grades 

One way to assess the endogeneity of Catholic religion is to identify a sample of persons
for whom Catholic high school is not a serious option, and then interpret the coefficient
on Ci in a single equation model as an estimate of the direct effect of Catholic religion on
the outcome. Only 0.3% of public school eighth graders in our effective sample go on to
attend Catholic high school; the percentage is 0.7% among public eighth grade attendees

ageable given sample sizes. The expectations variables in Table 3a are excluded from our outcome models 
because if Catholic school has an effect on outcomes, this may influence expectations.
whose parents are Catholic. For the moment we abstract from the fact that restricting the
analysis to the public eighth-grade sample will induce some selection bias in estimates of the
direct relationship between Catholic religion and the outcome. We argue at the end of the
section that taking account of such selection bias strengthens the evidence against Ci as an
instrument.

To motivate the exercise in this section suppose that

$$Y_i = \alpha CH_i + X_i'\gamma + \varepsilon_i, \tag{1}$$

where $X_i$ is uncorrelated with $\varepsilon_i$. The problem is that $CH_i$ and potentially $C_i$ may be
correlated with the error term. If we estimate $\alpha$ by 2SLS using $C_i$ as an instrument for $CH_i$
the bias is

$$\text{2SLS bias} = \frac{Cov(\tilde{C}_i, \varepsilon_i)}{\lambda\, Var(\tilde{C}_i)},$$

where $\tilde{C}_i$ are the residuals of a regression of $C_i$ on $X_i$, $\phi$ is
$Cov(\tilde{C}_i, \varepsilon_i)/Var(\tilde{C}_i)$, and $\lambda$ is the probability
limit of the coefficient on $C_i$ from the first stage regression. Now suppose there is an event
$P_i$ on which we can condition for which $\Pr(CH_i = 1 \mid P_i) = 0$. In our application this event
is attendance of a public eighth grade by individual $i$. Assume that the joint distribution
of $(X_i, C_i, \varepsilon_i)$ is independent of $P_i$. Consider a regression of $Y_i$ on $X_i$ and $C_i$ conditional on
$P_i$. Under these conditions, the coefficient on $C_i$ will converge to $\phi$. Since we have a
consistent estimate of $\lambda$ from the first stage regression, we can obtain a consistent estimate
of the bias $\psi$ by taking the ratio $\phi/\lambda$ or by estimating the parameter $\psi$ in the regression
model

$$Y_i = X_i'\gamma + [C_i\lambda]\psi + \omega_i \tag{2}$$

on the public eighth grade sample.
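
The logic of this check can be illustrated with a small simulation. The sketch below is purely
illustrative and is not part of the estimation reported here: it drops the controls $X_i$, treats
the outcome as continuous, and uses made-up parameter values (`alpha`, `phi`, and the attendance
probabilities are all assumptions). Under those assumptions the Wald (2SLS) estimate on the full
sample should be off by roughly $\phi/\lambda$, and the same quantity should be recovered from the
gap in mean outcomes by $C_i$ within the subsample for which $CH_i = 0$.

```python
import random


def mean(values):
    return sum(values) / len(values)


def mean_gap(outcome, group):
    """Difference in mean outcome between group == 1 and group == 0."""
    ones = [y for y, g in zip(outcome, group) if g == 1]
    zeros = [y for y, g in zip(outcome, group) if g == 0]
    return mean(ones) - mean(zeros)


def simulate(n=200_000, alpha=0.10, phi=0.05, seed=0):
    """Hypothetical data: C is correlated with the error, and CH = 0 whenever P = 1."""
    rng = random.Random(seed)
    C, P, CH, Y = [], [], [], []
    for _ in range(n):
        c = 1 if rng.random() < 0.3 else 0       # Catholic indicator
        p = 1 if rng.random() < 0.6 else 0       # public eighth grade, independent of (c, eps)
        eps = phi * c + rng.gauss(0.0, 0.5)      # direct link between C and the error term
        if p == 1:
            ch = 0                               # Pr(CH = 1 | P) = 0
        else:
            ch = 1 if rng.random() < (0.50 if c == 1 else 0.05) else 0
        C.append(c)
        P.append(p)
        CH.append(ch)
        Y.append(alpha * ch + eps)
    return C, P, CH, Y


def bias_check(alpha=0.10, phi=0.05):
    C, P, CH, Y = simulate(alpha=alpha, phi=phi)
    lam_hat = mean_gap(CH, C)                    # first stage, full sample
    iv_hat = mean_gap(Y, C) / lam_hat            # Wald estimate (2SLS with a binary instrument)
    y_pub = [y for y, p in zip(Y, P) if p == 1]
    c_pub = [c for c, p in zip(C, P) if p == 1]
    phi_hat = mean_gap(y_pub, c_pub)             # coefficient on C where CH is always 0
    return {
        "iv_bias": iv_hat - alpha,
        "implied_bias": phi_hat / lam_hat,
        "lambda": lam_hat,
        "phi": phi_hat,
    }


if __name__ == "__main__":
    print(bias_check())
```

In this design the first stage coefficient is 0.4 x (0.50 - 0.05) = 0.18, so both `iv_bias` and
`implied_bias` should be close to 0.05/0.18, up to sampling error.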

In column 1 of Table 4 we report estimates of the bias parameter $\psi$ using this approach.
We present separate equations estimated for $HS_i$, $COLL_i$, and the 12th grade math and
reading test scores. The vector $X_i$ includes all of the other controls that were included
in our models in Tables 1 and 2. For ease of comparison, the table also presents the
corresponding 2SLS estimates from Tables 1 and 2.

^^Eliminating the 36 students who attended public 8th grade and went on to Catholic high school has
little effect on the results.

The results are striking: the implied bias in the 2SLS estimate is 0.34 (0.08) for $HS_i$, which
is identical to the 2SLS coefficient itself.^^ The large potential bias should raise a great
deal of concern about using Catholic as an instrument, particularly given the remarkable
similarity between the magnitudes of the bias and the 2SLS estimate. In our view, this
evidence alone is sufficient to rule out $C_i$ as a useful instrument.

In the college attendance case the (unreported) estimate of $\phi$ is 0.038 (0.013). Catholic
students are nearly four percentage points more likely to enroll in a four year college than
non-Catholics even when Catholic high school is not a serious option. This relationship implies
a bias of 0.29 (0.11) in 2SLS estimates, so it seems likely that the large 2SLS estimates in
Table 1 result from the endogeneity of $C_i$ with respect to both high school graduation and
college attendance. Similar calculations imply that the math test score estimate from Table
2 can largely be explained by potential bias of 1.85 (1.41) for the 12th grade math scores.
Part of the college attendance and test score effects may be “real,” as these large corrections
are still smaller than the 2SLS point estimates, but the substantial evidence of endogeneity
of $C_i$ combined with the imprecision of the estimates prevents any firm conclusions about
the effect of Catholic high school on these outcomes.
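
The implied-bias figures quoted in this section are simple ratios of the direct effect of the
instrument to the first stage coefficient. As a minimal sketch, using only point estimates
reported in the text and its footnote (0.044 and 0.038 for the direct effects, 0.130 for the
first stage), the arithmetic is as follows; the function name is hypothetical and standard
errors are not reproduced.

```python
def implied_2sls_bias(phi_hat, lambda_hat):
    """Implied 2SLS bias: direct effect of the instrument divided by the first stage coefficient."""
    if lambda_hat == 0:
        raise ValueError("first stage coefficient must be nonzero")
    return phi_hat / lambda_hat


LAMBDA_HAT = 0.130  # first stage coefficient on Catholic religion, as reported in the text

for outcome, phi_hat in {"HS": 0.044, "COLL": 0.038}.items():
    print(outcome, round(implied_2sls_bias(phi_hat, LAMBDA_HAT), 2))
# prints "HS 0.34" and "COLL 0.29"
```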

We now return to the selection problem induced by focusing only on public eighth graders.
The analysis in this section has treated public eighth grade attendance as if it were randomly
assigned. We typically would expect positive selection of Catholics into Catholic grade
schools. That is, Catholic students who attend Catholic grade schools are likely to have
higher values of $\varepsilon_i$ in equation (1) than Catholic public school students. Since non-Catholics
are much less likely to attend Catholic schools this effect will lead to a negative bias in
$Cov(\tilde{C}_i, \varepsilon_i)$ when we condition on public school attendance.^^ This would imply that our
estimates of $\phi/\lambda$ are biased downward, which makes the results in this section even more
surprising.

^^To see how we arrive at this figure, note that the estimate of $\phi$ in the $HS_i$ equation is 0.044 (0.011). That
is, the graduation probability among students who go to public eighth grade is estimated to be 0.044 higher
for Catholics than non-Catholics, even though hardly any of these students attend Catholic high schools.
Since $\lambda$ is estimated to be 0.130 (0.009), the bias is approximately 0.34 (=0.044/0.130).

^^To see this in a simple case, abstract from observables so that $\tilde{C}_i = C_i$, and assume that non-Catholics
do not attend Catholic schools, that $E(\varepsilon_i \mid C_i) = 0$ unconditional on $P_i$, and that there is positive selection
into Catholic eighth grades so that $E(\varepsilon_i \mid C_i = 1, P_i^c) > E(\varepsilon_i \mid C_i = 1, P_i)$, where $P_i^c$ is the complement of $P_i$.
This implies that $E(\varepsilon_i \mid C_i = 1, P_i) < 0$ and thus the bias is negative.

3.3 Using the Observables to Assess the Bias from Unobservables

In this section we extend the methodology of AET to assess the potential bias in the
instrumental variables estimates. For simplicity we focus on the linear case when illustrating the
procedure although the methods are also applicable to non-linear models.

Let the outcome $Y_i$ be again determined by

$$Y_i = \alpha CH_i + X_i'\gamma + \varepsilon_i, \tag{3}$$

where $\gamma$ is defined so that $cov(\varepsilon_i, X_i) = 0$.

$CH_i$ is potentially endogenous and thus correlated with $\varepsilon_i$. We assume that our
instrument $Z_i$ does not influence $Y_i$ directly, but is correlated with $CH_i$. However, $Z_i$ is not
necessarily a valid instrument because it may be correlated with $\varepsilon_i$.

Define $\beta$, $\pi$, and $\lambda$ to be the coefficients of the least squares projections

$$Proj(Z_i \mid X_i) = X_i'\pi, \tag{4}$$

$$Proj(CH_i \mid X_i, Z_i) = X_i'\beta + \lambda Z_i. \tag{5}$$

Define $v_i$ and $u_i$ as the residuals of these projections, so that

$$v_i = Z_i - X_i'\pi, \tag{6}$$

$$u_i = CH_i - X_i'\beta - \lambda Z_i, \tag{7}$$

and note that $v_i$ and $u_i$ are orthogonal to $X_i$ by construction. Consider the regression of $Y_i$
on $(X_i'\beta + \lambda Z_i)$ and $X_i$. The coefficient on $(X_i'\beta + \lambda Z_i)$ in this regression converges to

$$\text{plim}\,\hat{\alpha} = \alpha + \frac{cov(v_i, \varepsilon_i)}{\lambda\, var(v_i)}. \tag{8}$$

One can see from (8) that the crucial assumption justifying the validity of $Z_i$ as an instrument
is that

$$cov(v_i, \varepsilon_i) = 0. \tag{9}$$

Under this condition, 2SLS yields a consistent estimate of $\alpha$.
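
The limit in (8) can be verified in a few lines. The following is a sketch of the argument that
uses only the definitions in (3) through (7); the step annotations are added for illustration.

```latex
\begin{align*}
Y_i &= \alpha\,(X_i'\beta + \lambda Z_i) + X_i'\gamma + \alpha u_i + \varepsilon_i
    && \text{substitute (7) into (3)} \\
X_i'\beta + \lambda Z_i &= X_i'(\beta + \lambda\pi) + \lambda v_i
    && \text{by (6); the part orthogonal to } X_i \text{ is } \lambda v_i \\
\operatorname{plim}\hat{\alpha} &= \frac{cov(\lambda v_i,\, Y_i)}{var(\lambda v_i)}
    && \text{partialling out } X_i \\
  &= \alpha + \frac{\alpha\, cov(v_i, u_i) + cov(v_i, \varepsilon_i)}{\lambda\, var(v_i)}
    && v_i \text{ is orthogonal to } X_i \\
  &= \alpha + \frac{cov(v_i, \varepsilon_i)}{\lambda\, var(v_i)}
    && cov(v_i, u_i) = 0
\end{align*}
```

The last step uses the fact that $u_i$ is orthogonal to both $X_i$ and $Z_i$ by construction of
the projection in (5), and $v_i$ is a linear combination of $Z_i$ and $X_i$.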

In contrast we consider the case in which $Z_i$ is not a valid instrument and the researcher
does not have a strong prior about how it is determined. In particular, rather than assume
that the choice of $X_i$ ensures that $v_i$ is uncorrelated with $\varepsilon_i$, assume that $X_i$ is a random subset
of all of the factors that determine $Y_i$ in addition to $CH_i$. In AET we derive Condition 1-IV
(not shown), as an alternative to the assumption $cov(v_i, \varepsilon_i) = 0$. Condition 1-IV says that
the effect on $Z_i$ of a unit change in the index of observables that determine $Y_i$ is the same as
the effect of a unit change in the index of unobservables. The condition can be written as

$$\frac{cov(v_i, \varepsilon_i)}{var(\varepsilon_i)} = \frac{cov(X_i'\pi, X_i'\gamma)}{var(X_i'\gamma)}. \tag{10}$$

Describing the assumptions that lead to (10) requires that we introduce more of the
notation from AET. Let the outcome $Y_i$ be determined as

$$Y_i = \alpha CH_i + W_i'\Gamma = \alpha CH_i + X_i'\Gamma_X + \xi_i,$$

where $W_i$ is the vector of characteristics (observed and unobserved) that fully determine $Y_i$
and $\Gamma$ is the causal effect of $W_i$ on $Y_i$. In the second part of the equation $X_i$ is the vector of
observed variables, $\Gamma_X$ is the corresponding subvector of $\Gamma$, and the error component $\xi_i$ is an
index of the unobserved variables. Because it is extremely unlikely that the control variables
$X_i$ are all unrelated to $\xi_i$, we work with (3) where $\gamma$ is defined so that $cov(\varepsilon_i, X_i) = 0$.^^

The precise conditions that imply Condition 1-IV are given in AET, but basically it
requires the following three types of assumptions:

1. the elements of $X_i$ are chosen at random from the full set of factors $W_i$ that determine
$Y_i$,

2. the number of elements in $X_i$ and $W_i$ is large, and none of the factors dominates the
distribution of the instrument $Z_i$ or the outcome $Y_i$,

3. the relationship between the observable elements $X_i$ and the unobservables obeys a
very strong assumption that is similar to, but weaker than the standard assumption
$cov(X_i, \xi_i) = 0$ that is maintained when applying instrumental variables estimators.^^

^^Consequently, $\gamma$ captures both the direct effect of $X_i$ on $Y_i$, $\Gamma_X$, as well as the relationship between $X_i$
and the mean of $\xi_i$. Note that $W_i'\Gamma = X_i'\Gamma_X + \xi_i$.

^^Mean independence of $\xi_i$ and $X_i$ is maintained in virtually all studies of selection problems, because
without it, $\alpha$ is not identified even if one has a valid exclusion restriction (the exception is when the instrument
is uncorrelated with $X_i$ as well as $\xi_i$, as when the instrument is randomly assigned in an experiment). If
the observables are correlated with one another, as in most applications, then the observed and unobserved
determinants of $Y_i$ are also likely to be correlated.

Assume that the conditional expectation is linear. Following the notation above, define $\gamma$ and $\varepsilon_i$ to be the
slope vector and error term of the “reduced form”

$$E(Y_i - \alpha CH_i \mid X_i) = X_i'\gamma$$
$$Y_i - \alpha CH_i - E(Y_i - \alpha CH_i \mid X_i) = \varepsilon_i.$$

Let the projection of $Z_i$ on $W_i$ be

$$Proj(Z_i \mid W_i) = W_i'\Pi.$$

One may easily adapt the analysis in appendix A.2 of AET to obtain a sufficient set of assumptions for
Condition 1-IV in that paper or equivalently, (10) above, to hold. The sufficient assumptions are assumptions
1. and 2. above and

$$\frac{\sum_{\ell=-\infty}^{\infty} E(W_{ij} W_{ij-\ell})\, E(\Pi_j \Gamma_{j-\ell})}{\sum_{\ell=-\infty}^{\infty} E(W_{ij} W_{ij-\ell})\, E(\Gamma_j \Gamma_{j-\ell})} = \frac{\sum_{\ell=-\infty}^{\infty} E(\tilde{W}_{ij} \tilde{W}_{ij-\ell})\, E(\Pi_j \Gamma_{j-\ell})}{\sum_{\ell=-\infty}^{\infty} E(\tilde{W}_{ij} \tilde{W}_{ij-\ell})\, E(\Gamma_j \Gamma_{j-\ell})}, \tag{***}$$

where $\tilde{W}_{ij}$ is the component of $W_{ij}$ that is orthogonal to $X_i$. Roughly speaking (***) says that the regression
of $Z_i$ on $Y_i - \alpha CH_i$ is equal to the regression of the part of $Z_i$ that is orthogonal to $X_i$ on the corresponding
part of $Y_i - \alpha CH_i$. One can show that this condition holds given assumptions 1 and 2 under the standard
assumption $E(\xi_i \mid X_i) = 0$. However, $E(\xi_i \mid X_i) = 0$ is not necessary for (***). For example, the analysis in
appendix A.2 of AET implies that (***) will also hold if $E(\Pi_j \Gamma_{j-\ell})$ is proportional to $E(\Gamma_j \Gamma_{j-\ell})$ regardless
of the correlations among the $W_{ij}$.

Under these assumptions the relationship between the indices of observables in the
equation for $Z_i$ and the outcome equation will be the same as the relationship between the indices
of unobservables in the two equations, as implied by (10).

In the case in which $Z_i$ is an indicator variable such as $C_i$, (10) can be rewritten as

$$\frac{E(\varepsilon_i \mid Z_i = 1) - E(\varepsilon_i \mid Z_i = 0)}{Var(\varepsilon_i)} = \frac{E(X_i'\gamma \mid Z_i = 1) - E(X_i'\gamma \mid Z_i = 0)}{Var(X_i'\gamma)}. \tag{11}$$

The term $[E(X_i'\gamma \mid Z_i = 1) - E(X_i'\gamma \mid Z_i = 0)]/Var(X_i'\gamma)$ is the normalized shift in the index of
observables in the outcome equation that is associated with $Z_i$, while the term
$[E(\varepsilon_i \mid Z_i = 1) - E(\varepsilon_i \mid Z_i = 0)]/Var(\varepsilon_i)$ is the corresponding normalized shift in the
distribution of unobservables. This is a formalization of the
common practice of checking for a systematic relationship between an instrumental variable
and the mean of the elements of $X_i$. Intuitively, if one estimates
$[E(X_i'\gamma \mid Z_i = 1) - E(X_i'\gamma \mid Z_i = 0)]/Var(X_i'\gamma)$ and
finds that it is substantially different from zero, one may be worried that the null hypothesis
$E(\varepsilon_i \mid Z_i) = 0$ is wrong.

We can use (11) to approximate the amount of bias in 2SLS estimates of Catholic
schooling effects if selection on unobservables is similar to selection on observables. Combining
equations (3)-(7), one can rewrite

$$Y_i = \alpha\lambda v_i + X_i'[\gamma + \alpha(\beta + \lambda\pi)] + \alpha u_i + \varepsilon_i.$$



Since $v_i$ is orthogonal to $X_i$ and $u_i$, the asymptotic bias from two stage least squares would
be

$$\begin{aligned}
\text{plim}(\hat{\alpha} - \alpha) &= \frac{cov(\lambda v_i, \varepsilon_i)}{var(\lambda v_i)} \\
&= \frac{var(Z_i)}{\lambda\, var(v_i)}\,[E(\varepsilon_i \mid Z_i = 1) - E(\varepsilon_i \mid Z_i = 0)] && (12) \\
&= \frac{var(Z_i)}{\lambda\, var(v_i)}\,\frac{Var(\varepsilon_i)}{Var(X_i'\gamma)}\,[E(X_i'\gamma \mid Z_i = 1) - E(X_i'\gamma \mid Z_i = 0)], && (13)
\end{aligned}$$

where we have used (11) to obtain (13) from (12). The hypothesis of equal selection on
observables and unobservables provides a way of identifying $[E(\varepsilon_i \mid Z_i = 1) - E(\varepsilon_i \mid Z_i = 0)]$,
and therefore the asymptotic bias of instrumental variable estimates, since the other terms
in (13) are readily and consistently estimable. AET develops extensions to
the case of latent dependent variables, so both probit and linear 2SLS bias calculations are
given where appropriate.
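
To make the mechanics of (11) and (13) concrete, the following is a minimal sketch of the
calculation for a binary instrument. It is not the code used for the estimates reported here: the
function names are hypothetical, the inputs are assumed to be already-estimated scalars or a
fitted index $X_i'\gamma$ supplied as a list, and no standard errors are computed.

```python
from statistics import mean, pvariance


def mean_shift(index, z):
    """E(index | Z = 1) - E(index | Z = 0) for a binary instrument z."""
    ones = [x for x, zi in zip(index, z) if zi == 1]
    zeros = [x for x, zi in zip(index, z) if zi == 0]
    return mean(ones) - mean(zeros)


def normalized_shift(index, z):
    """Normalized shift in the index of observables, as on the right-hand side of (11)."""
    return mean_shift(index, z) / pvariance(index)


def implied_iv_bias(var_z, lam, var_v, var_eps, var_index, shift_index):
    """Asymptotic 2SLS bias under equal selection on observables and unobservables, as in (13).

    var_z       variance of the instrument Z
    lam         first stage coefficient on Z
    var_v       variance of the residual of Z after projecting on X
    var_eps     variance of the outcome error
    var_index   variance of the index of observables X'gamma
    shift_index E(X'gamma | Z = 1) - E(X'gamma | Z = 0)
    """
    if lam == 0 or var_v == 0 or var_index == 0:
        raise ValueError("lam, var_v and var_index must be nonzero")
    return (var_z / (lam * var_v)) * (var_eps / var_index) * shift_index
```

For example, with the made-up inputs `implied_iv_bias(0.25, 0.5, 0.2, 1.0, 0.5, 0.1)` the three
factors are 2.5, 2.0 and 0.1, so the implied bias is 0.5.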

We wish to stress at the outset that one should not make too much of the specific
estimates of bias, which are based on strong assumptions about the symmetry of selection on
observables and unobservables. In AET, we argue that the relationship between the indexes
of unobservables that determine $CH_i$ and $Y_i$ is likely to be weaker than the relationship
between the indexes of observables, in part because many of the factors that determine
graduation and college attendance are determined after 8th grade and are excluded from $X_i$
by design. We are less clear about the force of this argument in the case of $C_i$ and the
other instruments we consider. The variables $C_i$, $D_i$, and $C_i \times D_i$ could all be correlated
with pre and post 8th grade influences on $Y_i$ that are not correlated with $CH_i$, but these
correlations could be stronger or weaker than the link between factors that determine $CH_i$
and $Y_i$. However, we suspect that they are considerably weaker, which means that bias
estimates will be too large in absolute value.

One may refine the bias calculations to account for the fact that the variation in the
instrument may only be over a specific dimension. For example, $D_i$ only varies across zip
code, and so must be orthogonal to variation in $X_i'\gamma$ and in $\varepsilon_i$ that is within zip code.
Consequently, we adjust the bias estimates by using variance in $E(X_i'\gamma)$ across zip codes
relative to the variance within zip codes as a guide to the variance in $E(\varepsilon_i \mid D_i)$ relative to
the cross area variance in $\varepsilon_i$.^22

Column (1) of Table 5 presents the results, which are quite striking. In the case of high
school graduation, for linear 2SLS we calculate a bias of 0.52 (0.23) in $\hat{\alpha}$ if we include $D_i$
among the set of variables used to form the index of observables and 0.84 (0.26) if we exclude
it. These are both huge potential biases, greater in magnitude than the implausibly large
2SLS point estimate, which is repeated in this table for convenience. The table reports a
similar calculation of the bias in the 2SLS estimate of $\alpha$ when $COLL_i$ is the dependent variable. In
this case, the bias estimate under the assumptions leading to (11) is 0.45 (0.21), which is
slightly larger than the 2SLS estimate of 0.40. If selection on unobservables follows the
same pattern as selection on observables, there is a huge bias in the IV estimates when $C_i$
is used as an instrument, at least for the cohort of children sampled in NELS:88.^^ The
results reinforce our conclusions based on the public 8th grade sample. However, we also
wish to stress that the bias estimates have large standard errors and are best interpreted as
a warning sign of potential trouble rather than a precise estimate of what the bias is.

The bottom panels of Table 5 repeat the calculations for 12th grade test scores. These
calculations use estimates of the reliability of the NELS:88 tests to provide a rough
adjustment for the fact that much of the variance in $\varepsilon_i$ is due to noise in the tests and thus is
unrelated to the instrument.^^ The calculations suggest that there is the potential for substantial bias
when using $C_i$ as an instrument, but the estimates are very imprecise. In the case of math
the bias estimates of 2.02 (0.75) and 1.87 (0.74) (depending again on whether $D_i$ is used in
the calculations) preclude any firm conclusions. In general, we cannot rule out the
possibility of a positive effect of Catholic high school attendance on achievement test scores, but
the large potential biases suggest that the use of $C_i$ as an instrument is not a reliable
way to assess the magnitude of these effects.

^22 With sibling data one could refine the calculations to some degree based on the observation that the
effect of parents' religious background is common to siblings. At least in the context of an additively
separable model, the connection between $C_i$ and $\varepsilon_i$ must involve the component of $\varepsilon_i$ that is common to
siblings. One could use the value of $[E(X_i'\gamma \mid C_i = 1) - E(X_i'\gamma \mid C_i = 0)]$ relative to the cross family variance
in $X_i'\gamma$ as a guide to $[E(\varepsilon_i \mid C_i = 1) - E(\varepsilon_i \mid C_i = 0)]$ relative to cross family variation in $\varepsilon_i$. Unfortunately,
NELS:88 does not identify siblings and, because of its design, is likely to include only siblings who are twins
or very close in age.

^^This conclusion is also supported by calculations not reported that use a two stage probit procedure.
See Elder (2002) for details.

^^The adjustment is performed by multiplying the estimate of $\text{plim}(\hat{\alpha} - \alpha)$ based on (12) by
$(\text{reliability} - R^2)/(1 - R^2)$, where reliability is the estimate of the reliability of the particular test, and $R^2$ is the $R^2$ of
the model for the particular test. To see the justification, let the composite error term be $\varepsilon^* = \varepsilon + \varsigma$ where $\varsigma$
is the component of test scores due to noise in the test. One minus the reliability of the test is an estimate
of $var(\varsigma)/var(Y_i + \varsigma)$ where $Y_i$ is the true test score. The value 1 minus the $R^2$ of the test score model
is an estimate of $[var(\varepsilon) + var(\varsigma)]/var(Y_i + \varsigma)$, and note that $var(\varepsilon) = [var(\varepsilon)/(var(\varepsilon) + var(\varsigma))]\,var(\varepsilon^*)$.
Consequently,

$$var(\varepsilon) = \frac{\text{reliability} - R^2}{1 - R^2}\, var(\varepsilon^*).$$

The $R^2$ is 0.60 for 12th grade reading and 0.74 for 12th grade math (using the 2SLS estimate of the model
and ignoring the correlation between $CH_i$ and $\varepsilon_i$), and the reliability is 0.85 for 12th grade reading and 0.94
for 12th grade math. Consequently, the correction scales down the bias estimates by a factor of 0.625 for reading and
0.770 for math.
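
The reliability adjustment described in the footnote to this paragraph is a one-line calculation.
The sketch below simply evaluates the scale factor $(\text{reliability} - R^2)/(1 - R^2)$ at the
reliabilities and $R^2$ values quoted in the text; the function name is hypothetical.

```python
def reliability_scale(reliability, r_squared):
    """Share of the composite error variance that is not test noise."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return (reliability - r_squared) / (1.0 - r_squared)


# Values quoted in the text for the NELS:88 12th grade tests.
reading_scale = reliability_scale(0.85, 0.60)   # 0.25 / 0.40
math_scale = reliability_scale(0.94, 0.74)      # 0.20 / 0.26

print(round(reading_scale, 3), round(math_scale, 3))
```

With the rounded inputs quoted in the text, the reading factor is exactly 0.625 and the math
factor is approximately 0.77.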

The conclusion that we draw from these calculations is that IV procedures based on
$C_i$ lead to huge point estimates but may also be subject to a great deal of bias. In this
circumstance, $C_i$ is not a useful instrumental variable despite its powerful association with
$CH_i$. This inference is fully consistent with the evidence for a large direct association
between $C_i$ and the outcomes in the public 8th grade sample. We do not have a good
understanding of why the gap between the IV estimates of the Catholic school effect and
the probit or linear probability estimates is so much larger in NELS:88 than in NLS-72
or in High School and Beyond (see Evans and Schwab, 1995). Unfortunately, we lack the
rich set of primary school data required to use the relative degree of selection on observables
to explore the discrepancy in IV results across data sets. The variability across data sets,
which in part may reflect changes over time in the composition of the Catholic population
in the U.S., is an additional reason to be cautious about the use of $C_i$ as an instrument.



4 Instrumental Variables Estimates using Proximity to 
Catholic Schools 

In this section we evaluate proximity ($D_i$) as a source of identifying variation. The main
theoretical justification for $D_i$ is that it should affect the costs of attending a Catholic
high school, while the main concern is that the location of Catholic high schools may be
associated with characteristics of the population, public schools, post-secondary schools,
and labor market, all of which influence outcomes.

In Column (2) of Table 1 we report estimates with $D_i$ as the excluded instrument. It
is important to re-emphasize that because of the nonlinearity of the bivariate probit model,
both $D_i$ and the interaction between $D_i$ and $C_i$ play a role in identification in the bivariate
probit case (as well as the method of using two-stage probits), so the 2SLS estimates are
cleaner in this regard. The 2SLS estimate of -0.04 (0.10) for $HS_i$ is surprising but too imprecise for
us to draw any inferences from it. The 2SLS estimate for $COLL_i$ is 0.31 (0.11) in NELS:88
and 0.44 (0.20) in NLS-72. Both estimates are much larger than the estimated marginal
effect of 0.085 from the univariate probit in NELS:88 and 0.070 from NLS-72. Column (2)
of Table 2 presents the results for test scores in NELS:88 and NLS-72. These coefficients
vary across specifications, but for the NLS-72 test scores they imply very large effects. On
their face, these findings appear implausible, so we next explore the degree to which they
are influenced by bias using the same methods as in section 3.

In Column (3) of Table 3a we report the relationship between a wide set of observables
in NELS:88 and a student’s distance from the nearest Catholic high school. For simplicity
we collapsed the vector $D_i$ into a dummy variable $D6_i$ which is equal to 1 for person $i$ if she
lives less than 6 miles from the nearest Catholic high school and zero otherwise, and present
the difference in these means by $D6_i$. Among the eighth grade measures, such as teacher
evaluations of the student’s behavior, there is little difference between those who live close to
Catholic high schools and those who do not. However, there is a positive relationship between
$D_i$ and most of the family background measures. There is also a positive association between
proximity and both student and parental educational expectations. Similar differences by
$D6_i$ appear in NLS-72 (Table 3b). These differences in family motivation and students’
home environment introduce the possibility that there might also be unmeasured differences
which could affect outcomes and lead to bias in models using $D_i$ as an instrumental variable
in both NLS-72 and NELS:88.
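
The comparison of means by $D6_i$ described above amounts to the following calculation. This is
a minimal sketch with hypothetical function names and made-up inputs; it ignores sampling
weights and standard errors, which a full analysis of survey data would need to handle.

```python
def d6(distance_miles, cutoff=6.0):
    """Dummy equal to 1 if the student lives less than `cutoff` miles from a Catholic high school."""
    return 1 if distance_miles < cutoff else 0


def difference_in_means(values, distances, cutoff=6.0):
    """Mean of `values` for students with D6 = 1 minus the mean for students with D6 = 0."""
    near = [v for v, d in zip(values, distances) if d6(d, cutoff) == 1]
    far = [v for v, d in zip(values, distances) if d6(d, cutoff) == 0]
    if not near or not far:
        raise ValueError("both distance groups must be non-empty")
    return sum(near) / len(near) - sum(far) / len(far)


# Made-up example: years of parental education and distance in miles.
parent_education = [12, 14, 10, 12]
distance = [2.0, 5.9, 6.0, 20.0]
print(difference_in_means(parent_education, distance))  # 13.0 - 11.0 = 2.0
```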

In column (2) of Table 4 we report estimates of the bias coefficient $\psi$ based on the
equation

$$Y_i = X_i'\gamma + [D_i'\lambda]\psi + \omega_i \tag{14}$$

for public eighth graders from NELS:88. In (14), $D_i'\lambda$ is the index of distance dummies
weighted by their coefficients $\lambda$ in the first stage equation for $CH_i$. The estimate of $\psi$ is
-0.05 (0.12) in the equation for $HS_i$ and 0.37 (0.12) in the equation for $COLL_i$. There is not
much evidence for bias in the $HS_i$ equation given the large standard error, but this is not
surprising given that the 2SLS estimate is also noisy and does not indicate a positive effect.
For $COLL_i$, the implied bias is slightly larger than the 2SLS estimate, reaffirming the notion
that one should not put too much stock in inferences using $D_i$ as an instrument for college
attendance, at least in NELS:88. In the case of reading the bias check is uninformative
given the large standard error on $\psi$. For 12th grade math scores, the evidence in favor of a
positive effect of $CH_i$ is dampened by the fact that implied bias estimates are large in this
case as well. Given both the evidence of endogeneity and the large standard errors of the
2SLS estimates, we conclude that the 2SLS estimates using $D_i$ are not useful in drawing
conclusions regarding test scores.^^
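
The regressor in (14) is a single index built from the distance dummies and their first stage
coefficients. As a minimal sketch of that construction (the coefficient values and the function
name below are hypothetical, not estimates from this paper):

```python
def distance_index(dummies, first_stage_coefs):
    """Index D'lambda: distance dummies weighted by their first stage coefficients."""
    if len(dummies) != len(first_stage_coefs):
        raise ValueError("one coefficient is required per distance dummy")
    return sum(d * lam for d, lam in zip(dummies, first_stage_coefs))


# Hypothetical first stage coefficients for three distance bands (omitted band: farthest).
lam_hat = [0.10, 0.05, 0.02]

print(distance_index([1, 0, 0], lam_hat))  # student in the nearest band
print(distance_index([0, 0, 0], lam_hat))  # student in the omitted band
```

Regressing the outcome on $X_i$ and this index within the public eighth grade sample then gives
the coefficient interpreted as the implied bias.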

Finally, we apply the AET methodology for assessing the potential bias due to
selection on unobservables. The extension of the methods to account for the fact that $D_i$ is a
vector is straightforward, with the relevant condition analogous to (10) being
[equation lost in extraction].^^
The results are in Column (2) of Table 5. The estimates computed under the
assumption of equal selection on observables and unobservables show the potential for large
positive biases for both $HS_i$ and $COLL_i$. The fact that the bias estimates for the two
different outcomes have the same sign is not surprising, since it reflects the similarity in the effects
of $X_i$ on the two education outcomes. While the specific bias estimates are noisy and are
probably overstated for reasons discussed above, the large estimate for $COLL_i$ suggests that
the 2SLS coefficients are not informative. Finally, for 12th grade math scores, the estimates
of 1.72-1.76 (depending on whether $C_i$ is included in the calculations involving $X_i'\gamma$) again
do not preclude a small Catholic schooling effect, but instrumental variables estimates using
$D_i$ do not provide a reliable gauge of the strength or even the sign of the effect.

Although we are unable to directly evaluate $D_i$ as an instrument in NLS-72 other than through
the informal analysis based on Table 3b, the calculations based on NELS:88 cast further
doubt on the validity of the large estimates obtained for outcomes in this data set.

5 Instrumental Variables Estimates using the Interaction

Finally, we turn to the interaction between Ci and Di as the source of identifying variation. 
In Column (3) of Table 1 we report probit, bivariate probit, linear probability and 2SLS 
estimates of the effect of CHi on high school graduation and college attendance. Column (3) 
of Table 2 presents results for test scores. All of the models include both Ci and Di among 
the controls. 

^^It should be noted that the public 8th grade analysis is likely less informative for $D_i$ than for $C_i$ because
of the likelihood that distance from Catholic elementary school and distance from Catholic high school are
closely related. Consequently, selection issues may have a bigger effect on the coefficient on the index when
the distance variables are involved than when only religion is involved.

^^Note that equation (10) can be written similarly as [equation lost in extraction] because $\varepsilon_i$ and $v_i$ are
orthogonal to $X_i$.

The results vary across the different outcomes. In the case of educational attainment,
the bivariate probit and 2SLS point estimates are negative in two of the three cases. For
test scores, the 2SLS estimates lie below the OLS ones in three of the four cases, with 12th
grade math score coefficients being fairly large and negative in both data sets. However,
in all cases in NELS:88 the standard errors are too large in relation to the difference between
the OLS and 2SLS estimates for the 2SLS estimates to help much in modifying conclusions
about $\alpha$. This is less true in NLS-72.

We have investigated the properties of the instrument using the same set of procedures
that we used for $C_i$ and $D_i$ with the same bottom line. Given the imprecision in some of the
estimates, the lack of previous work using $C_i \times D_i$ as an instrument, and space considerations,
we will not get into the details.^^ However, the weight of the evidence in Tables 1-5 leads us to
be very skeptical of the interaction as an exclusion restriction. In particular, there is evidence
in both data sets that the difference between Catholics and non-Catholics in favorable family
background characteristics rises with distance from the nearest Catholic high school. If the
link between $C_i \times D_i$ and $\varepsilon_i$ followed the same pattern, the 2SLS estimates would be biased
downward. We suspect that this underlies the negative coefficients for some outcomes in
both data sets, particularly NLS-72. We conclude that $C_i \times D_i$ is not a very useful source of
variation for the purpose of estimating the Catholic school effect, at least not in the context
of NELS:88 or NLS-72.

^^In Column (4) of Table 3a we report the coefficient on $C_i \times D6_i$ from regressions of the various background
and outcome variables indicated in the rows on $C_i$, $D6_i$, and $C_i \times D6_i$. The results for the eighth grade measures
are mixed, with $C_i \times D6_i$ being positively associated with indicators for whether the student got into a fight
at school, but negatively correlated with the “repeated grade” indicator. There are also slight comparative
advantages in eighth grade GPA and reading scores. In contrast, family background, student expectations,
and parental expectations are generally negatively correlated with $C_i \times D6_i$, with striking differences in
parental education levels and expectations.

For NLS-72, the estimates in Table 3b imply that the difference in mother’s and father’s education between
Catholic and non-Catholic students who live within 6 miles of a Catholic high school is 0.33 and 0.32 years
lower, respectively, than the difference among Catholic and non-Catholic students who live more than 6 miles
from a Catholic high school. The incomes of Catholics relative to non-Catholics also rise with distance,
and all of these figures are nearly identical to the corresponding ones in NELS:88. Additionally, student
educational expectations are strongly correlated with $C_i \times D6_i$, with a coefficient of -0.06 (0.016). We have
not investigated why low SES Catholics are disproportionately located near Catholic high schools, but if the
unobservable parental traits that influence the outcomes we study follow a similar pattern, then our 2SLS
estimates of the effect of Catholic schools are likely to be negatively biased for both the NLS-72 and NELS:88
cohorts.

6 Conclusion



We present evidence on the validity of using three sources of variation in Catholic school
attendance (religious affiliation, proximity to Catholic schools, and the interaction between
religion and proximity) as a way to identify the effect of attending Catholic high school.
The simplest evidence comes from the relationship between the instrument candidates and
the means of a large set of observable measures in NELS:88 and NLS-72. In NELS:88, we use
the fact that very few students who attend public eighth grade go on to attend Catholic high
school as the basis for interpreting the association between an outcome and an instrument
in a sample of public eighth graders as an estimate of the direct link between the instrument
and the outcome in question. The final approach applies a method introduced in AET that
takes advantage of the rich set of observable demographic, family background, and eighth
grade outcome data in NELS:88. The idea is that if the observed variables included as
controls are representative of the factors that determine the outcomes, then the relationship
between observables and the instruments can be used as a guide to the relationship between
the error term in the outcome equation and the instruments.

We will not attempt to restate all the results, which are sometimes contradictory across
outcomes and data sets. Our main conclusion is that none of the candidate instruments is
a useful source of identification of the Catholic school effect, at least in the NELS:88 data
set. For example, we find a strong relationship between Catholic religion and educational
achievement in the sample of public eighth graders, who almost never attend Catholic high
school. We obtain similar results for distance from the nearest Catholic high school in the
case of college attendance. We also find a fairly strong relationship between the
instruments and the index of observed variables that determine the outcomes. Although we
cannot formally evaluate the magnitude of bias in NLS-72, the strong relationship between
observables and distance in these data, in conjunction with the NELS:88 findings, raises
serious doubts that the results found are due to a genuine causal effect.

We wish to stress that we are not advocating literal interpretation of specific estimates
of bias based on the public eighth grade sample or the AET methodology. However, the
evidence strongly suggests that the candidate instruments are not valid instrumental variables
for Catholic high school. Future research on the effects of Catholic schooling will hopefully
introduce new methods, such as those described in AET, which do not necessitate exclusion
restrictions. Alternatively, future work may involve either new exclusion restrictions
altogether or different measures of religion or proximity to Catholic schools than the ones that
we and others have considered. Finally, experiments along the lines of Howell and
Peterson (2002), while difficult to run, have large advantages in identifying the effect of Catholic
school attendance on outcomes.

Appendix: A Comparison Between Bivariate Probits
and Two Stage Least Squares



Evans and Schwab (1995) and Neal (1997) apply bivariate probits of Catholic schooling and
an educational outcome such as either high school graduation or college attendance using
data from High School and Beyond and NLSY, respectively. Both papers emphasize the
importance of an exclusion restriction in the model for identification. As we have already
noted, Evans and Schwab (1995) primarily use Catholic religion, excluding it from the
outcome equation but including it in the Catholic schooling decision. Neal (1997) uses an
indicator for Catholic religion along with county level measures of the density of Catholics
and the availability of Catholic schools. Both of these papers find positive effects of Catholic
schools that are estimated fairly precisely. The bivariate probit results reported in this
paper generally follow the same pattern, with estimates being much more precise and
reasonable than linear specifications. It is therefore worth investigating the reasons why our
instrumental variables results are so noisy and in many cases seem unreasonable, while the
bivariate probits seem to show plausible results that are precisely estimated.

At this point it is useful to more closely examine identification in the bivariate probit
model. The specification used in Neal (1997), Evans and Schwab (1995), and here is

$$CH_i = 1(g(X_i) + u_i > 0)$$

$$Y_i = 1(\alpha CH_i + f(Z_i) + \varepsilon_i > 0),$$

where $1(\cdot)$ is the indicator function taking the value one if its argument is true and zero
otherwise. Identification of the $\alpha$ coefficient is the primary focus of these studies. This
model is similar to other types of selection models (see, e.g., Heckman, 1990, Cameron and
Heckman, 1998, or Taber, 2000), so we appeal to the results in that literature.

As is well known, identification of $\alpha$ essentially requires two assumptions:

1. Either (a) parametric assumptions on the distribution of the error terms or (b) support
conditions on $g(X_i)$,

2. Either (a) an exclusion restriction specifying that a variable belongs in the Catholic
schooling equation but not in the outcome equation or (b) parametric restrictions on
$f$ and $g$.
That is, identification can be achieved by combining either 1(a) or 1(b) with either 2(a)
or 2(b). Evans and Schwab (1995) and Neal (1997) implicitly assume that identification
comes from the exclusion restrictions. However, both papers (and this one) also assume $g$
and $f$ are linear, i.e., $g(X_i) = X_i'\beta$ and $f(Z_i) = Z_i'\gamma$, which can be shown to satisfy 2(b)
so that an exclusion restriction is not necessary. Since identification can be achieved from
either the exclusion restriction or the linearity assumption, in practice it is difficult to know
which assumption drives identification in the empirical application.
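
To make the two-equation model concrete, the sketch below simulates it with linear indexes and jointly normal errors, then contrasts a naive difference in means with a linear IV estimate that relies only on an exclusion restriction. It is a minimal illustration and not the estimator used in this paper: the index coefficients, the error correlation `rho`, and the binary excluded variable `x` are hypothetical values chosen for the example. The Wald ratio stands in for 2SLS because, with a single binary instrument and no other controls, the two coincide.

```python
import random
from statistics import mean


def simulate(n, alpha=0.5, rho=0.5, seed=0):
    """Draw (x, ch, y) triples from the threshold model with linear indexes.

    All numeric values are illustrative assumptions, not estimates from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0          # hypothetical excluded variable
        u = rng.gauss(0.0, 1.0)                     # selection-equation error
        # Outcome error built so that corr(u, eps) = rho and var(eps) = 1.
        eps = rho * u + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        ch = 1 if -1.0 + 1.0 * x + u > 0 else 0     # CH = 1(g(X) + u > 0)
        y = 1 if alpha * ch + 0.2 + eps > 0 else 0  # Y = 1(alpha*CH + f(Z) + eps > 0), x excluded
        rows.append((x, ch, y))
    return rows


def naive_difference(rows):
    """Difference in mean outcomes by attendance, ignoring selection."""
    attended = mean(y for _, ch, y in rows if ch == 1)
    not_attended = mean(y for _, ch, y in rows if ch == 0)
    return attended - not_attended


def first_stage(rows):
    """Effect of the excluded variable on the attendance rate."""
    return mean(ch for x, ch, _ in rows if x == 1) - mean(ch for x, ch, _ in rows if x == 0)


def wald_iv(rows):
    """Reduced-form effect on the outcome divided by the first-stage effect."""
    reduced_form = mean(y for x, _, y in rows if x == 1) - mean(y for x, _, y in rows if x == 0)
    return reduced_form / first_stage(rows)


rows = simulate(20000)
print("naive difference:", round(naive_difference(rows), 3))
print("first stage:     ", round(first_stage(rows), 3))
print("Wald / linear IV:", round(wald_iv(rows), 3))
```

With positively correlated errors, the naive comparison overstates the effect in this simulated setting, while the Wald ratio uses only the variation induced by the excluded variable. A bivariate probit fitted to the same data would additionally exploit the normality and linear index assumptions, which is the source of the ambiguity discussed above.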

Evans and Schwab (1995) experiment with both bivariate probits and two stage least
squares. They also employ two different instruments, Catholic religion and the percentage of
Catholics in the county, which are similar to Neal’s (1997) exclusion restrictions. When they
run two stage least squares, they find implausible estimates in some specifications, depending
on the specific exclusions maintained. Neal (1997) does not report results based on linear
2SLS.

In order to better assess what is identifying the bivariate probit models, as well as
facilitate an easier comparison between the results of this paper and the previous literature, we
examine the sensitivity of our results from NLS-72 to different specifications using bivariate
probits. We use a sample design based loosely on Neal (1997), in that we look at individuals
from urban areas and examine separate effects for blacks and whites.^ The results are
reported in Table A1. We obtain results which are similar to Neal’s in several respects. First,
the univariate probit coefficient of 0.640 (0.198) implies a large positive effect for non-whites.
Second, the coefficient from a bivariate probit specification which uses Neal’s exclusion
restrictions for urban minorities, Catholic religion and the county-level ratio of Catholics to
the overall population, is actually larger than the univariate one, at 0.879 (0.523), although this
difference is not significantly different from zero. Third, the estimates appear at first glance
to be of a reasonable magnitude. In particular, the probit coefficients are comparable to the
ones reported both in Neal (1997) and in Table 1 of this paper. However, the marginal
effects of 0.239 and 0.329 for the univariate and bivariate models, respectively, are suspiciously
large.
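
Because a probit coefficient is measured on the scale of the latent index, what it implies for probabilities depends on where the index is evaluated. The sketch below converts the two coefficients quoted above into changes in probability at a few baseline probabilities. The baselines and the function name `probit_effect` are hypothetical and chosen only for illustration; the calculation is not claimed to reproduce the marginal effects reported in Table A1, whose exact construction is not described here.

```python
from statistics import NormalDist


def probit_effect(coef, baseline_prob):
    """Change in P(Y = 1) when a binary regressor with probit coefficient
    `coef` switches from 0 to 1, starting from `baseline_prob`."""
    std_normal = NormalDist()                   # standard normal distribution
    index = std_normal.inv_cdf(baseline_prob)   # latent index at the baseline
    return std_normal.cdf(index + coef) - baseline_prob


# Coefficients quoted in the text; baseline probabilities are assumptions.
for baseline in (0.2, 0.5, 0.8):
    univariate = probit_effect(0.640, baseline)
    bivariate = probit_effect(0.879, baseline)
    print(f"baseline {baseline:.1f}: univariate {univariate:.3f}, bivariate {bivariate:.3f}")
```

The same coefficient translates into a smaller change in probability when the baseline is far from one half, which is one reason marginal effects and raw probit coefficients need to be read separately.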

Table A1 also shows that for urban minorities, the estimated bivariate probit coefficients
are relatively insensitive to exclusion restrictions, and appear to be largely driven by the
functional form assumptions embedded in these models. To see this, note that the precision
of the estimates does not vary much with specification, even when only a “weak” instrument
such as $C_i \times D_i$ is excluded, or when there are no excluded instruments at all (bottom row).
The standard error of the coefficient on $CH_i$ is smaller in both of these cases than when
the more powerful instrument $C_i$ is excluded, which seems at odds with the notion that
the exclusions are driving identification. In contrast, 2SLS estimates swing wildly across
specifications, with the results being similar to Evans and Schwab (1995) and our own earlier
results; we typically find improbably large effects with standard errors that are sufficiently
large that any estimate within the realm of plausibility would not be significantly different
from zero at conventional levels. In the most precisely estimated specification involving all
three exclusion restrictions, the coefficient of 0.331 (0.254) implies a huge effect yet is not
significant. In the case of the weakest instrument, $C_i \times D_i$, the coefficient of 2.572 (2.442) is
so large that it cannot be interpreted literally within the linear probability framework, yet
it is still insignificantly different from zero.

^ We have not replicated the analysis for NELS:88 for several reasons. Most importantly, we could not
accurately match students to counties, as no county-level identifiers are available in these data at present.
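
The significance statements in the discussion of Table A1 for urban non-whites can be checked directly from the reported coefficients and standard errors. The sketch below computes the ratio of each estimate to its standard error and a normal-approximation 95 percent interval. The figures are the ones quoted in the text; the helper names and the use of the 1.96 critical value are conventions assumed for the example.

```python
# (coefficient, standard error) pairs quoted in the text for urban non-whites.
estimates = {
    "univariate probit": (0.640, 0.198),
    "bivariate probit, Neal's exclusions": (0.879, 0.523),
    "2SLS, all three exclusions": (0.331, 0.254),
    "2SLS, C_i x D_i excluded": (2.572, 2.442),
}


def z_ratio(coef, se):
    """Ratio of an estimate to its standard error."""
    return coef / se


def interval_95(coef, se, critical=1.96):
    """Normal-approximation 95 percent confidence interval."""
    half_width = critical * se
    return coef - half_width, coef + half_width


for label, (coef, se) in estimates.items():
    low, high = interval_95(coef, se)
    significant = abs(z_ratio(coef, se)) > 1.96
    print(f"{label}: z = {z_ratio(coef, se):.2f}, "
          f"interval = ({low:.3f}, {high:.3f}), significant = {significant}")
```

Under this approximation only the univariate probit coefficient exceeds the 1.96 threshold, and the interval for the 2SLS estimate with $C_i \times D_i$ excluded covers values far outside the unit range of a linear probability model.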

The results for whites are again fairly similar across specifications, although the precision
of the estimates now varies with the choice of instrument. In the 2SLS case, both precision
and the coefficients themselves are relatively constant except when $C_i \times D_i$ is used as an
exclusion restriction. It appears that in this subsample, the exclusion restrictions are driving
a larger share of identification than they were for urban minorities, but that the linear index
assumption in conjunction with normality is still playing a large role.

Although the specifications of Table A1 do not involve exact replications of the analyses
of either Evans and Schwab (1995) or Neal (1997),^ we believe that they do shed some
light on the sources of the apparent discrepancies in the results. Table A1 suggests that
the proximity measures in both of these studies do not play a key role in identification in
NLS-72, as standard errors in the 2SLS models are prohibitively large in cases in which
Catholic religion is not an excluded instrument. Bivariate probit models can sometimes
produce misleading results which are consistent with a reasonably exogenous instrumental
variable, when in fact identification is stemming from an invalid instrument in combination
with functional form assumptions. In order to isolate the role of each of these factors, it is
necessary to implement IV strategies that rely on nothing other than exclusion restrictions
for identification.

^ We could not replicate Neal (1997) exactly because he used an indicator for whether students attended a
Catholic high school in the National Longitudinal Survey of Youth that is not available in the public release
version of the data set. We experimented with NLSY using an indicator for whether the student attended
public school. We obtained results qualitatively similar to those based on NLS-72, with the bivariate probit
results being even less sensitive to exclusion restrictions than in NLS-72.
Table 1

Probit, Bivariate Probit, OLS, and 2SLS Estimates of Catholic Schooling Effects
NELS:88 and NLS-72

Weighted, Marginal Effects of Nonlinear Models Reported (Huber-White Standard Errors in Parentheses)

                                          Excluded Instruments
                                          (1)               (2)               (3)
                                          Catholic          Distance          Catholic x Distance
                                          ($C_i$)           ($D_i$)           ($C_i \times D_i$)

HS Graduation (NELS:88)
  Probit (controls exclude “instrument”)  0.065 (0.025)     0.047 (0.025)     0.052 (0.026)
  Bivariate Probit                        0.128 (0.032)     -0.007 (0.085)    -0.022 (0.119)
  OLS                                     0.041 (0.014)     0.021 (0.014)     0.023 (0.015)
  2SLS                                    0.34 (0.08)       -0.04 (0.10)      0.09 (0.11)

College in 1994 (NELS:88)
  Probit (controls exclude “instrument”)  0.094 (0.022)     0.085 (0.022)     0.077 (0.022)
  Bivariate Probit                        0.170 (0.055)     0.103 (0.062)     -0.043 (0.070)
  OLS                                     0.128 (0.026)     0.119 (0.026)     0.111 (0.026)
  2SLS                                    0.40 (0.10)       0.31 (0.11)       -0.11 (0.12)

College in 1976 (NLS-72)
  Probit (controls exclude “instrument”)  0.068 (0.016)     0.070 (0.016)     0.067 (0.016)
  Bivariate Probit                        -0.002 (0.028)    -0.052 (0.035)    -0.080 (0.035)
  OLS                                     0.071 (0.015)     0.075 (0.016)     0.072 (0.016)
  2SLS                                    0.06 (0.04)       0.44 (0.20)       -0.25 (0.11)

Notes:

(1) All models other than univariate probits instrument for Catholic High School attendance ($CH_i$).

(2) Controls for all NELS:88 models include the demographic, family background, geography, and 8th grade variables listed in Table 3a. Controls
for all NLS-72 models include the demographic, family background, and geography variables listed in Table 3b. When $D_i$ is used as an instrument,
$C_i$ is included as a control; when $C_i$ is an instrument, $D_i$ is included; and when $D_i \times C_i$ is an instrument, both $D_i$ and $C_i$ are included.

(3) Sample sizes: N=8560 (HS Graduation), N=8313 (College Attendance in NELS), N=19,489 (College Attendance in NLS-72)
Table 2

OLS and 2SLS Estimates of Catholic Schooling Effects
NELS:88 and NLS-72

Weighted (Huber-White Standard Errors in Parentheses)

                                          Excluded Instruments
                                          (1)               (2)               (3)
                                          Catholic          Distance          Catholic x Distance
                                          ($C_i$)           ($D_i$)           ($C_i \times D_i$)

12th Grade Reading Score (NELS:88)
  OLS                                     1.16 (0.37)       1.03 (0.37)       1.14 (0.38)
  2SLS                                    1.40 (1.54)       -1.09 (1.84)      1.24 (1.82)

12th Grade Math Score (NELS:88)
  OLS                                     1.03 (0.31)       1.00 (0.31)       0.92 (0.32)
  2SLS                                    2.64 (1.21)       2.43 (1.45)       -2.63 (1.57)

12th Grade Reading Score (NLS-72)
  OLS                                     2.06 (0.34)       2.54 (0.37)       2.50 (0.36)
  2SLS                                    -1.34 (0.99)      8.69 (4.53)       0.50 (2.32)

12th Grade Math Score (NLS-72)
  OLS                                     1.52 (0.33)       1.77 (0.35)       1.71 (0.36)
  2SLS                                    -0.07 (0.96)      11.05 (4.47)      -3.94 (2.27)

Notes:

(1) All 2SLS models instrument for Catholic High School attendance ($CH_i$).

(2) Controls for all models include those described in notes to Table 1. When $D_i$ is used as an instrument,
$C_i$ is included as a control; when $C_i$ is an instrument, $D_i$ is included; and when $D_i \times C_i$ is an instrument,
both $D_i$ and $C_i$ are included as controls.

(3) Sample sizes: N=8,166 (NELS 12th Reading), N=8,119 (NELS 12th Math),
N=16,276 (NLS Academic Years of School), N=14,671 (NLS Reading and Math scores).
Table 3a

Comparison of Means of Key Variables
by Value of Distance, Catholic, and their Interaction
NELS:88

                                      (1)           (2)           (3)           (4)
                                      Overall       Difference    Difference    Difference by
                                      Mean          by $C_i$      by $D_i$      $C_i \times D_i$

Demographics
  Female                              0.50          0.01          0.00          0.00
  Asian                               0.04          0.01          0.04          -0.02
  Hispanic                            0.10          0.19          0.08          0.03
  Black                               0.13          -0.15         0.08          -0.13
  White                               0.73          -0.05         -0.20         0.12

Family Background
  Mother’s education                  13.14         -0.26         0.17          -0.36
  Father’s education                  13.42         -0.07         0.17          -0.31
  Log of family income                10.20         0.11          0.12          -0.02
  Mother only in house                0.15          -0.04         0.02          -0.03
  Parent married                      0.78          0.06          -0.02         0.03

Geography
  Rural                               0.32          -0.15         -0.44         0.05
  Suburban                            0.44          0.06          0.08          0.00
  Urban                               0.24          0.09          0.36          -0.05

Expectations
  Schooling expectation               15.17         0.15          0.31          -0.06
  Very sure to graduate high school   0.83          -0.01         0.00          -0.01
  Parents expect some college         0.88          0.04          0.05          -0.02
  Parents expect college grad         0.78          0.03          0.06          -0.04
  Expect white collar job             0.46          0.03          0.06          -0.01

8th Grade Variables
  Delinquency Index                   0.69          -0.05         0.03          -0.04
  Got into fight                      0.27          -0.01         0.01          0.05
  Rarely completes homework           0.21          -0.05         0.00          0.00
  Frequently disruptive               0.13          -0.02         -0.01         0.00
  Repeated grade 4-8                  0.08          -0.03         0.01          -0.03
  Risk Index                          0.72          -0.07         -0.01         0.01
  Grades Composite                    2.89          0.04          0.00          0.07
  Unpreparedness Index                10.82         0.00          0.08          -0.09
  8th Grade reading score             50.32         0.40          0.03          1.15
  8th Grade math score                50.33         0.55          0.45          0.06

Outcomes
  10th Grade reading score            50.16         0.65          0.58          0.60
  10th Grade math score               50.21         0.93          0.75          -0.50
  12th Grade reading score            50.40         0.52          0.88          -0.17
  12th Grade math score               50.38         1.18          1.03          -0.70
  Enrolled in 4 year college in 1994  0.29          0.08          0.08          -0.05
  HS Graduate                         0.84          0.07          0.01          0.01
  Attended Catholic HS                0.06          0.13          0.12          0.15

Notes:

(1) Difference by $C_i \times D_i$ is obtained from the coefficient on $C_i \times D_i$ in a regression including $C_i$ and $D_i$ as controls

(2) Sample Size: N=16,070
Table 3b

Comparison of Means of Key Variables
by Value of Distance, Catholic, and their Interaction
NLS-72

                                      (1)           (2)           (3)           (4)
                                      Overall       Difference    Difference    Difference by
                                      Mean          by $C_i$      by $D_i$      $C_i \times D_i$

Demographics
  Female                              0.50          -0.01         0.03          0.03
  Hispanic                            0.04          0.11          0.01          -0.07
  Black                               0.15          -0.15         0.04          -0.08

Family Background
  Mother’s education                  12.19         -0.13         0.16          -0.33
  Father’s education                  12.43         0.06          0.40          -0.32
  Log of family income                8.93          0.07          0.11          -0.03
  Father Blue Collar                  0.24          0.01          -0.03         -0.01
  Low SES Indicator                   0.29          -0.05         -0.06         0.00
  English Primary Language            0.92          -0.06         -0.02         0.03
  Family Receives Daily Newspaper     0.88          0.04          0.06          0.01
  Mother Works                        0.50          -0.06         0.03          0.01

Geography
  Rural                               0.23          -0.14         -0.30         0.05
  Suburban                            0.48          0.06          0.02          -0.04
  Urban                               0.29          0.08          0.28          -0.01

Expectations
  Decided to go to college pre-HS     0.41          -0.01         0.04          -0.06

Outcomes
  Enrolled in college by 1976         0.38          0.01          0.05          -0.06
  Reading Score                       50.01         0.30          0.46          0.55
  Math Score                          49.98         0.58          0.40          -0.10
  Years of Academic PSE, 1979         1.61          0.03          0.22          -0.23
  Attended Catholic HS                0.06          0.19          0.07          0.15

Notes:

(1) Difference by $C_i \times D_i$ is obtained from the coefficient on $C_i \times D_i$ in a regression including $C_i$ and $D_i$ as controls

(2) Sample Size: N=19,921
Table 4

Comparison of 2SLS Estimates and Bias Implied by OLS Estimation of the Model for $Y_i$ in $X_i$ and $Z_i$
on the Public Eighth Grade Subsample; Various Outcomes and Instruments; NELS:88 Sample
Weighted (Huber-White Standard Errors in Parentheses)

                              INSTRUMENTS ($Z_i$)
OUTCOME ($Y_i$)               (1)               (2)               (3)
                              Catholic          Distance          Catholic x Distance

High School Graduation
  Implied Bias in 2SLS        0.34 (0.08)       -0.05 (0.12)      0.15 (0.12)
  2SLS Coefficient            0.34 (0.08)       -0.04 (0.10)      0.09 (0.11)

College Attendance
  Implied Bias in 2SLS        0.29 (0.11)       0.37 (0.12)       -0.23 (0.13)
  2SLS Coefficient            0.40 (0.10)       0.31 (0.11)       -0.11 (0.12)

12th Grade Reading Score
  Implied Bias in 2SLS        0.54 (1.68)       -0.51 (2.08)      -0.50 (1.99)
  2SLS Coefficient            1.40 (1.54)       -1.09 (1.84)      1.24 (1.82)

12th Grade Math Score
  Implied Bias in 2SLS        1.85 (1.41)       1.83 (1.69)       -4.37 (2.06)
  2SLS Coefficient            2.64 (1.21)       2.43 (1.45)       -2.63 (1.57)

Notes:

(1) Controls for all models include those described in notes to Table 1. In Column 1, $D_i$ is included as a control; in Column 2,
$C_i$ is included as a control; and in Column 3, both $D_i$ and $C_i$ are included as controls.

(2) The model for $Y_i$, which includes $X_i'\gamma$, a term in $Z_i$, and the error $\omega_i$, is estimated by OLS using the NELS:88 sample
of those who attended public eighth grade schools. Sample sizes: N=7,701 (HS Graduation), N=7,481 (College Attendance),
N=7377 (12th reading), N=7380 (12th math). $\lambda$ is the coefficient on $Z_i$ in the first stage equation for $CH_i$. The sample
sizes for the first stage equations are listed in Tables 1 and 2 for the various outcomes. The 2SLS coefficients are from Tables 1 and 2.

(3) Reported standard errors of the implied bias account for the fact that $\lambda$ is previously estimated from a model of $CH_i$ attendance.
Table 5

Estimates of Catholic Schooling Effects and Estimates of Potential Bias
Using AET Methodology, NELS:88
Weighted (Huber-White Standard Errors in Parentheses)

                              Excluded Instruments
                              (1)               (2)               (3)
                              Catholic          Distance          Catholic x Distance

HS Graduation
  2SLS Coefficient            0.34 (0.08)       -0.04 (0.10)      0.09 (0.11)
  Bias 1                      0.52 (0.23)       0.15 (0.16)       0.14 (0.24)
  Bias 2                      0.84 (0.26)       0.06 (0.14)

College in 1994
  2SLS Coefficient            0.40 (0.10)       0.31 (0.11)       -0.11 (0.12)
  Bias 1                      0.45 (0.21)       0.46 (0.22)       0.15 (0.26)
  Bias 2                      0.45 (0.21)       0.40 (0.20)

12th Reading Score
  2SLS Coefficient            1.40 (1.54)       -1.09 (1.84)      1.24 (1.82)
  Bias 1                      1.18 (1.06)       2.49 (1.59)       2.59 (1.14)
  Bias 2                      1.42 (1.07)       2.11 (1.40)

12th Math Score
  2SLS Coefficient            2.64 (1.21)       2.43 (1.45)       -2.63 (1.57)
  Bias 1                      2.02 (0.75)       1.76 (1.03)       1.42 (0.88)
  Bias 2                      1.87 (0.74)       1.72 (0.98)

Notes:

(1) Controls included are described in Table 1 notes.

(2) Sample sizes: N=8560 (HS Graduation), N=8313 (College Attendance in NELS), N=8,166 (12th Reading),
N=8,199 (12th Math).

(3) “Bias 1” calculations use all variables, while “Bias 2” excludes $D_i$ and $C_i$ in the bias calculations.

(4) Standard Errors of the bias calculations obtained from a 100-replication bootstrap
Table A1

Comparison of Linear and Non-Linear Models of College Attendance in NLS-72
(Standard Errors in Parentheses)
[Marginal Effects of Non-Linear Models in Brackets]

                                    Non-whites in cities (N=1532)                   Whites in cities (N=5326)
                                    Nonlinear Models        Linear Models           Nonlinear Models        Linear Models
                                    (Probits)               (OLS/2SLS)              (Probits)               (OLS/2SLS)

Single Equation Model (OLS/Probit)  0.640 (0.198) [0.239]   0.239 (0.070)           0.253 (0.062) [0.093]   0.093 (0.022)

Two Equation Models:
Excluded Instruments:
  %CCHi and CH/Pi                   1.471 (0.442) [0.517]   1.375 (0.583)           0.048 (0.250) [0.018]   0.115 (0.158)
  $C_i$ and %CCHi                   0.879 (0.523) [0.329]   0.054 (0.309)           -0.090 (0.121) [-0.033] -0.036 (0.050)
  $C_i$, %CCHi, and CH/Pi           1.106 (0.460) [0.409]   0.331 (0.254)           -0.085 (0.118) [-0.031] -0.034 (0.048)
  $C_i$ only                        0.761 (0.543) [0.285]   -0.093 (0.324)          -0.133 (0.130) [-0.049] -0.056 (0.054)
  $C_i \times D_i$                  1.333 (0.516) [0.478]   2.572 (2.442)           -0.121 (0.262) [-0.044] -0.395 (0.169)
  None                              1.224 (0.542) [0.446]                           -0.094 (0.301) [-0.034]

Notes:

(1) Sample is taken from counties in the NLS-72 which had a population of greater than 250,000 in 1980.

(2) All equations control for parents’ education and income levels and SES, whether father is a blue-collar worker, county
population, gender and race.

(3) The instrument %CCHi refers to the percent of the county which reports they are Catholic church members,
and CH/Pi to Catholic schools per person in the county.